-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client, server systemd units: make Restart=always truly respected #184
Conversation
Surprisingly, Restart=always may not _always_ restart the unit if it restarts too fast. Set a combination of options which should make systemd truly restart innernet always. See https://unix.stackexchange.com/q/289629/352972. The `RestartSec=60` is the main and important one which would prevent systemd from ever failing to restart innernet in the default settings (because with it it would never exceed the default limit of 5 restarts in 10 seconds). `StartLimitIntervalSec=0` option is a complementary one for explicitly disabling the logic, and may be removed from this PR if deemed unnecessary. Should fix tonarino/portal#1441 (link to issue in private repository).
server/innernet-server@.service
Outdated
|
||
[Service] | ||
Type=simple | ||
Environment="RUST_LOG=info" | ||
ExecStart=/usr/bin/innernet-server serve %i | ||
Restart=always | ||
# When the daemon exits, wait this amount of secs before restarting. Prevents innernet from | ||
# start-looping each 100ms for example when there is a problem reaching the server. | ||
RestartSec=60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For client, the restart interval of 60 secs is somewhat natural as it is equal to the fetch interval.
But I don't have an idea what the server restart interval should be. Perhaps a bit less?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think it should be closer to 1s than 60s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening this! Make sense especially to have the client try harder to keep the interface updated :).
server/innernet-server@.service
Outdated
@@ -2,12 +2,18 @@ | |||
Description=innernet server for %I | |||
After=network-online.target nss-lookup.target | |||
Wants=network-online.target nss-lookup.target | |||
# Disable systemd's unit start rate limiting logic, which could override Restart=always. | |||
# See https://unix.stackexchange.com/q/289629/352972 | |||
StartLimitIntervalSec=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not so sure we want to modify this unit much at all - if the server fails it's more likely to be a permanent problem and the typical systemd "give up" logic makes more sense, in my opinion.
Personally, I'd suggest we should only slightly increase the RestartSec
value and leave it otherwise as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not so sure we want to modify this unit much at all - if the server fails it's more likely to be a permanent problem and the typical systemd "give up" logic makes more sense, in my opinion.
That's a good point. I've made changes to systemd units more conservative in a fixup commit to resolve this and the 2 other comments.
client/innernet@.service
Outdated
|
||
[Service] | ||
Type=simple | ||
ExecStart=/usr/bin/innernet up %i --daemon --interval 60 | ||
Restart=always | ||
# When the daemon exits, wait this amount of secs before restarting. Prevents innernet from | ||
# start-looping each 100ms for example when there is a problem reaching the server. | ||
RestartSec=60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's ok with you, let's make this 10 rather than 60 since it may be a temporary issue and we should be retrying harder if a failure has caused innernet to not fetch peers and update the interface.
Surprisingly, Restart=always may not always restart the unit if it restarts too fast.
Set a combination of options which should make systemd truly restart innernet always.
See https://unix.stackexchange.com/q/289629/352972.
The
RestartSec=60
is the main and important one which would prevent systemd from ever failingto restart innernet in the default settings (because with it it would never exceed the default
limit of 5 restarts in 10 seconds).
StartLimitIntervalSec=0
option is a complementary one for explicitly disabling the logic, andmay be removed from this PR if deemed unnecessary.
Should fix tonarino/portal#1441 (link to issue in private repository).
I've tested this successfully with the client unit file (on an artificial problem though):